High Dimensional Feature Indexing Using Hybrid Trees

Authors

  • Kaushik Chakrabarti
  • Sharad Mehrotra
Abstract

Feature-based similarity search is emerging as an important search paradigm in database systems. The technique used is to map the data items as points into a high-dimensional feature space, which is indexed using a multidimensional data structure. Similarity search then corresponds to a range search over the data structure. Traditional multidimensional data structures (e.g., R-tree, KDB-tree, grid files) are of limited use for feature indexing since (1) their performance deteriorates rapidly with the increase in the dimensionality of the feature space (referred to as the "dimensionality curse") and (2) they do not support range queries based on arbitrary distance functions, a situation that occurs commonly in multimedia feature spaces. This paper introduces the hybrid tree, a multidimensional data structure for indexing high-dimensional feature spaces. The hybrid tree combines positive aspects of bounding region (BR)-based data structures (e.g., R-tree, SS-tree, SR-tree) and space partitioning (SP)-based data structures (e.g., KDB-tree, hB-tree) into a single data structure to achieve search performance more scalable to high dimensionalities than either of the above techniques. Furthermore, the hybrid tree supports range queries based on arbitrary distance functions. Our experiments on "real" high-dimensional, large-size feature databases demonstrate that the hybrid tree scales well to high dimensionality and large database sizes. It significantly outperforms both purely BR-based and SP-based index mechanisms, as well as linear scan, at all dimensionalities for large-sized databases.
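To make the idea of a range query under an arbitrary distance function concrete, the following is a minimal sketch of the linear-scan baseline that the abstract compares against. It is not the hybrid tree itself. The function names (`range_search`, `weighted_manhattan`, `euclidean`) and the sample feature vectors are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence

Point = Sequence[float]
Distance = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Standard L2 distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def weighted_manhattan(weights: Sequence[float]) -> Distance:
    """Build a weighted L1 distance; the weights are an illustrative assumption."""
    def dist(a: Point, b: Point) -> float:
        return sum(w * abs(x - y) for w, x, y in zip(weights, a, b))
    return dist


def range_search(points: Sequence[Point], query: Point,
                 radius: float, distance: Distance) -> List[int]:
    """Linear-scan baseline: indices of all points within `radius` of `query`.

    The distance function is a parameter, so the same query code works
    for any metric. An index structure would try to avoid visiting every point.
    """
    return [i for i, p in enumerate(points) if distance(query, p) <= radius]


# Hypothetical 3-dimensional feature vectors.
features = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (3.0, 3.0, 3.0)]
query = (0.0, 0.0, 0.0)

hits_l2 = range_search(features, query, 1.5, euclidean)
hits_w = range_search(features, query, 1.5, weighted_manhattan((1.0, 0.5, 1.0)))
```

Changing the distance function changes which points fall inside the query range, which is why an index tied to a single metric cannot answer such queries directly.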


Similar resources

A Divisive Hierarchical Clustering-Based Method for Indexing Image Information

It is conventional to use multi-dimensional indexing structures to accelerate search operations in content-based image retrieval systems. Much effort has gone into developing multi-dimensional indexing structures so far. In most practical applications of image retrieval, high-dimensional feature vectors are required, but current multi-dimensional indexing structures lose their effici...


Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach

Feature selection can be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not perform well in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...


The Hybrid Tree: An Index Structure for High Dimensional Feature Spaces

Feature based similarity search is emerging as an important search paradigm in database systems. The technique used is to map the data items as points into a high dimensional feature space which is indexed using a multidimensional data structure. Similarity search then corresponds to a range search over the data structure. Although several data structures have been proposed for feature indexing...


High-Dimensional Indexing for Multimedia Features

Efficient content-based similarity search in large multimedia databases requires efficient query processing algorithms for many practical applications. Especially in high-dimensional spaces, the huge number of features is a challenge to existing indexing structures. Due to increasing overlap with growing dimensionality, they eventually fail to deliver runtime improvements. In this work, we prop...


Determining Effective Features for Face Detection Using a Hybrid Feature Approach

Detecting faces in cluttered backgrounds and real-world scenes remains an unsolved problem. In this paper, by combining several kinds of independent features with one of the most common appearance-based approaches and multilayer perceptron (MLP) neural networks, not only have some questions been answered, but the designed system has also achieved better performance than the pre...



Journal title:

Volume   Issue

Pages  -

Publication date: 1999